AITopics

Country:

North America > United States > Illinois > Champaign County > Urbana (0.14)
North America > Canada (0.04)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (0.68)

Neural Information Processing SystemsDec-25-2025, 06:11:17 GMT

Universal Approximation of Input-Output Maps by Temporal Convolutional Nets

There has been a recent shift in sequence-to-sequence modeling from recurrent network architectures to convolutional network architectures due to computational advantages in training and operation while still achieving competitive performance. For systems having limited long-term temporal dependencies, the approximation capability of recurrent networks is essentially equivalent to that of temporal convolutional nets (TCNs). We prove that TCNs can approximate a large class of input-output maps having approximately finite memory to arbitrary error tolerance. Furthermore, we derive quantitative approximation rates for deep ReLU TCNs in terms of the width and depth of the network and modulus of continuity of the original input-output map, and apply these results to input-output maps of systems that admit finite-dimensional state-space realizations (i.e., recurrent models).

input-output map, temporal convolutional net, universal approximation, (4 more...)

Technology: Information Technology > Artificial Intelligence (0.42)

Joshua Hanson, Maxim Raginsky

Universal Approximation of Input-Output Maps by Temporal Convolutional Nets

Neural Information Processing SystemsOct-2-2025, 13:26:06 GMT

Neural Information Processing Systems http://nips.cc/

assumption 4, machine learning, natural language, (20 more...)

Country: North America > United States > Illinois > Champaign County > Urbana (0.14)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (0.68)

Neural Information Processing SystemsOct-9-2024, 20:32:37 GMT

Universal Approximation of Input-Output Maps by Temporal Convolutional Nets

There has been a recent shift in sequence-to-sequence modeling from recurrent network architectures to convolutional network architectures due to computational advantages in training and operation while still achieving competitive performance. For systems having limited long-term temporal dependencies, the approximation capability of recurrent networks is essentially equivalent to that of temporal convolutional nets (TCNs). We prove that TCNs can approximate a large class of input-output maps having approximately finite memory to arbitrary error tolerance. Furthermore, we derive quantitative approximation rates for deep ReLU TCNs in terms of the width and depth of the network and modulus of continuity of the original input-output map, and apply these results to input-output maps of systems that admit finite-dimensional state-space realizations (i.e., recurrent models).

input-output map, temporal convolutional net, universal approximation, (2 more...)

Technology: Information Technology > Artificial Intelligence (0.47)

Hernández, Martín, Zuazua, Enrique

Deep Neural Networks: Multi-Classification and Universal Approximation

arXiv.org Machine LearningSep-10-2024

We demonstrate that a ReLU deep neural network with a width of $2$ and a depth of $2N+4M-1$ layers can achieve finite sample memorization for any dataset comprising $N$ elements in $\mathbb{R}^d$, where $d\ge1,$ and $M$ classes, thereby ensuring accurate classification. By modeling the neural network as a time-discrete nonlinear dynamical system, we interpret the memorization property as a problem of simultaneous or ensemble controllability. This problem is addressed by constructing the network parameters inductively and explicitly, bypassing the need for training or solving any optimization problem. Additionally, we establish that such a network can achieve universal approximation in $L^p(\Omega;\mathbb{R}_+)$, where $\Omega$ is a bounded subset of $\mathbb{R}^d$ and $p\in[1,\infty)$, using a ReLU deep neural network with a width of $d+1$. We also provide depth estimates for approximating $W^{1,p}$ functions and width estimates for approximating $L^p(\Omega;\mathbb{R}^m)$ for $m\geq1$. Our proofs are constructive, offering explicit values for the biases and weights involved.

hyperplane, hyperrectangle, neural network, (14 more...)

2409.06555

Country:

Europe > Spain > Galicia > Madrid (0.04)
Europe > Spain > Basque Country > Biscay Province > Bilbao (0.04)
Europe > Netherlands > North Holland > Amsterdam (0.04)
(2 more...)

Genre:

Workflow (1.00)
Research Report (0.63)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Datar, Chinmay, Datar, Adwait, Dietrich, Felix, Schilders, Wil

Systematic construction of continuous-time neural networks for linear dynamical systems

arXiv.org Artificial IntelligenceMar-24-2024

Discovering a suitable neural network architecture for modeling complex dynamical systems poses a formidable challenge, often involving extensive trial and error and navigation through a high-dimensional hyper-parameter space. In this paper, we discuss a systematic approach to constructing neural architectures for modeling a subclass of dynamical systems, namely, Linear Time-Invariant (LTI) systems. We use a variant of continuous-time neural networks in which the output of each neuron evolves continuously as a solution of a first-order or second-order Ordinary Differential Equation (ODE). Instead of deriving the network architecture and parameters from data, we propose a gradient-free algorithm to compute sparse architecture and network parameters directly from the given LTI system, leveraging its properties. We bring forth a novel neural architecture paradigm featuring horizontal hidden layers and provide insights into why employing conventional neural architectures with vertical hidden layers may not be favorable. We also provide an upper bound on the numerical errors of our neural networks. Finally, we demonstrate the high accuracy of our constructed networks on three numerical examples.

lti system, matrix, neural network, (15 more...)

arXiv.org Artificial Intelligence

2403.16215

Country:

Europe > Germany > Bavaria > Upper Bavaria > Munich (0.04)
North America > Canada (0.04)
Europe > Netherlands > North Brabant > Eindhoven (0.04)
(2 more...)

Genre: Research Report (0.64)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Ho, Lam Si Tung, Dinh, Vu

Searching for Minimal Optimal Neural Networks

arXiv.org Machine LearningSep-27-2021

Large neural network models have high predictive power but may suffer from overfitting if the training set is not large enough. Therefore, it is desirable to select an appropriate size for neural networks. The destructive approach, which starts with a large architecture and then reduces the size using a Lasso-type penalty, has been used extensively for this task. Despite its popularity, there is no theoretical guarantee for this technique. Based on the notion of minimal neural networks, we posit a rigorous mathematical framework for studying the asymptotic theory of the destructive technique. We prove that Adaptive group Lasso is consistent and can reconstruct the correct number of hidden nodes of one-hidden-layer feedforward networks with high probability. To the best of our knowledge, this is the first theoretical result establishing for the destructive technique.

neural network, node, probability, (14 more...)

2109.13061

Country:

North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
North America > Canada > Nova Scotia > Halifax Regional Municipality > Halifax (0.04)

Genre:

Research Report (0.50)
Workflow (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Hanson, Joshua, Raginsky, Maxim

Universal Approximation of Input-Output Maps by Temporal Convolutional Nets

Neural Information Processing SystemsMar-19-2020, 02:30:50 GMT

There has been a recent shift in sequence-to-sequence modeling from recurrent network architectures to convolutional network architectures due to computational advantages in training and operation while still achieving competitive performance. For systems having limited long-term temporal dependencies, the approximation capability of recurrent networks is essentially equivalent to that of temporal convolutional nets (TCNs). We prove that TCNs can approximate a large class of input-output maps having approximately finite memory to arbitrary error tolerance. Furthermore, we derive quantitative approximation rates for deep ReLU TCNs in terms of the width and depth of the network and modulus of continuity of the original input-output map, and apply these results to input-output maps of systems that admit finite-dimensional state-space realizations (i.e., recurrent models). Papers published at the Neural Information Processing Systems Conference.

input-output map, temporal convolutional net, universal approximation, (2 more...)

Technology: Information Technology > Artificial Intelligence (0.74)

Hanson, Joshua, Raginsky, Maxim

Universal Approximation of Input-Output Maps by Temporal Convolutional Nets

arXiv.org Machine LearningJun-21-2019

There has been a recent shift in sequence-to-sequence modeling from recurrent network architectures to convolutional network architectures due to computational advantages in training and operation while still achieving competitive performance. For systems having limited long-term temporal dependencies, the approximation capability of recurrent networks is essentially equivalent to that of temporal convolutional nets (TCNs). We prove that TCNs can approximate a large class of input-output maps having approximately finite memory to arbitrary error tolerance. Furthermore, we derive quantitative approximation rates for deep ReLU TCNs in terms of the width and depth of the network and modulus of continuity of the original input-output map, and apply these results to input-output maps of systems that admit finite-dimensional state-space realizations (i.e., recurrent models).

artificial intelligence, machine learning, natural language, (18 more...)

1906.09211

Country: North America > United States > Illinois (0.28)

Genre: Research Report (0.51)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (0.69)

Raghu, Maithra, Poole, Ben, Kleinberg, Jon, Ganguli, Surya, Sohl-Dickstein, Jascha

Survey of Expressivity in Deep Neural Networks

arXiv.org Machine LearningNov-24-2016

We survey results on neural network expressivity described in "On the Expressive Power of Deep Neural Networks". The paper motivates and develops three natural measures of expressiveness, which all display an exponential dependence on the depth of the network. In fact, all of these measures are related to a fourth quantity, trajectory length. This quantity grows exponentially in the depth of the network, and is responsible for the depth sensitivity observed. These results translate to consequences for networks during and after training. They suggest that parameters earlier in a network have greater influence on its expressive power -- in particular, given a layer, its influence on expressivity is determined by the remaining depth of the network after that layer. This is verified with experiments on MNIST and CIFAR-10. We also explore the effect of training on the input-output map, and find that it trades off between the stability and expressivity.

artificial intelligence, machine learning, trajectory length, (14 more...)

1611.08083

Genre: Research Report (0.41)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)